Phase-Encoded Speech Spectrograms

Author

  • Chandra Sekhar Seelamantula
Abstract

Spectrograms of speech and audio signals are time-frequency densities; by construction, they are non-negative and have no phase associated with them. Under certain conditions on the amount of overlap between consecutive frames and on the frequency sampling, it is possible to reconstruct the signal from the spectrogram. Deviating from this requirement, we develop a new technique to incorporate the phase of the signal in the spectrogram by satisfying what we call the delta dominance condition, which in general is different from the well-known minimum-phase condition. In fact, there are signals that are delta dominant but not minimum-phase, and vice versa. The delta dominance condition can be satisfied in multiple ways, for example by placing a Kronecker impulse of the right amplitude or by choosing a suitable window function. A direct consequence of constructing spectrograms this way is that the phase of the signal is encoded, or embedded, in the spectrogram itself. We also develop a reconstruction methodology that takes such phase-encoded spectrograms and recovers the signal using the discrete Fourier transform (DFT). We envisage that this new class of phase-encoded spectrogram representations will find applications in speech processing tasks such as analysis, synthesis, enhancement, and recognition.
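The abstract does not give the construction or the reconstruction steps, so the following is only a minimal sketch of the idea for a single frame, under assumptions that are not taken from the paper. It assumes the Kronecker impulse is added at the first sample of the frame with amplitude `margin * sum(|x|)`; in that special case the frame is also minimum-phase, so the standard homomorphic (cepstral) minimum-phase reconstruction can stand in for the paper's own DFT-based method. The names `encode_frame`, `decode_frame`, and `margin` are hypothetical.

```python
import numpy as np


def encode_frame(x, n_fft, margin=4.0):
    """Return the power spectrum of one frame with an added impulse.

    Assumption (not from the abstract): the Kronecker impulse sits at
    n = 0 and its amplitude is margin * sum(|x|), so it dominates the
    frame. The frame must not be all zeros.
    """
    x = np.asarray(x, dtype=float)
    amp = margin * np.sum(np.abs(x))
    y = np.zeros(n_fft)
    y[: len(x)] = x          # zero-pad the frame to the DFT length
    y[0] += amp              # add the dominant impulse
    power = np.abs(np.fft.fft(y)) ** 2
    return power, amp


def decode_frame(power, amp, frame_len):
    """Recover the frame from its power spectrum alone (even n_fft).

    Uses the textbook cepstral folding for minimum-phase signals,
    swapped in here as a stand-in for the paper's reconstruction.
    """
    n_fft = len(power)
    # Real cepstrum of the log-magnitude spectrum.
    cep = np.fft.ifft(0.5 * np.log(power)).real
    # Fold the even cepstrum onto its causal half.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1 : n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    # DFT of the folded cepstrum gives log|Y| + j*arg(Y).
    log_spec = np.fft.fft(cep * fold)
    y = np.fft.ifft(np.exp(log_spec)).real
    y[0] -= amp              # remove the impulse again
    return y[:frame_len]


rng = np.random.default_rng(0)
x = rng.standard_normal(64)
power, amp = encode_frame(x, n_fft=1024)
x_hat = decode_frame(power, amp, len(x))
print(np.max(np.abs(x_hat - x)))  # small reconstruction error
```

The zero-padding to `n_fft` well beyond the frame length keeps cepstral aliasing small; a shorter DFT or a smaller `margin` would make this sketch less accurate. The window-based route to delta dominance mentioned in the abstract is not covered here.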


Similar resources

Speech Reconstruction from Binary Masked Spectrograms Using Vector Quantized Speaker Models

Several source separation techniques use binary masking on spectrograms to separate two or more speakers from each other. In this thesis, the possibilities for obtaining the best quality signal, reconstructed from masked spectrograms through vector quantized models of speakers, is investigated. The advantages and disadvantages of such an approach are examined. Additionally, the task of signal r...


Probabilistic Inference of Speech Signals from Phaseless Spectrograms

Many techniques for complex speech processing such as denoising and deconvolution, time/frequency warping, multiple speaker separation, and multiple microphone analysis operate on sequences of short-time power spectra (spectrograms), a representation which is often well-suited to these tasks. However, a significant problem with algorithms that manipulate spectrograms is that the output spectrog...


Deep Transform: Cocktail Party Source Separation via Complex Convolution in a Deep Neural Network

Convolutional deep neural networks (DNN) are state of the art in many engineering problems but have not yet addressed the issue of how to deal with complex spectrograms. Here, we use circular statistics to provide a convenient probabilistic estimate of spectrogram phase in a complex convolutional DNN. In a typical cocktail party source separation scenario, we trained a convolutional DNN to re-s...


非負矩陣分解法於語音調變頻譜強化之研究(A study of enhancing the modulation spectrum of speech signals via nonnegative matrix factorization)[In Chinese]

In this paper, we propose to enhance the modulation spectrum of the spectrograms for speech signals via the technique of non-negative matrix factorization (NMF). In the training phase, the clean speech and noise in the training set are separately transformed to spectrograms and modulation spectra in turn, and then the magnitude modulation spectra are used to train the NMF-based basis matrices f...


Speech reconstruction from human auditory cortex with deep neural networks

We examined the accuracy of the reconstructed speech spectrograms from neural responses recorded intracranially in human auditory cortex. Electrodes were implanted over the cortex of epilepsy patients for the localization of seizures, and neural responses were recorded as the subjects passively listened to continuous speech. We compared the reconstructed spectrograms estimated with two differen...




Publication date: 2016